A Weighted Overlap Add-based Front-end for Speech Recognition

Authors

  • S. M. Ahadi
  • H. Sheikhzadeh
  • R. L. Brennan
  • G. H. Freeman
Abstract

Speech signal enhancement is frequently referred to as a preprocessing step to speech recognition. However, in practice, this cannot be easily accomplished since the front-end signal processing techniques and/or parameters used in these two frequently differ. We apply a signal processing technique successfully used in speech enhancement to speech recognition and show that it can perform equally well compared to well-known speech recognition front-ends such as MFCC. The technique, oversampled filterbank analysis/synthesis through weighted overlap add (WOLA), has been tested and performed satisfactorily on the TI-46 and Aurora tasks in both clean and noisy conditions and also in subband speech recognition. The results indicate the capability of this technique in reducing the front-end signal processing blocks of enhancement and recognition into a single block.
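
As a rough illustration of the WOLA analysis stage named in the abstract, the sketch below windows each block of input, folds it in time down to the FFT size, and takes an FFT to obtain the subband signals. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `wola_analysis`, the Hann prototype window, and the parameter values (window length 128, FFT size 32, hop 8, giving an oversampling factor of 4) are all hypothetical, and the circular shift that a full WOLA filterbank applies to keep a fixed phase reference is omitted.

```python
import numpy as np

def wola_analysis(x, window, n_fft, hop):
    """Sketch of WOLA analysis: window, time-fold to n_fft points, then FFT.

    x      : 1-D real input signal
    window : analysis window; its length must be a multiple of n_fft
    n_fft  : FFT size (hypothetical value chosen by the caller)
    hop    : input samples consumed per frame (decimation factor)
    Returns an array of shape (n_frames, n_fft // 2 + 1) of complex subbands.
    """
    win_len = len(window)
    if win_len % n_fft != 0:
        raise ValueError("window length must be a multiple of n_fft")
    n_frames = 1 + (len(x) - win_len) // hop
    out = np.empty((n_frames, n_fft // 2 + 1), dtype=complex)
    for m in range(n_frames):
        # Weight the current block with the analysis window.
        seg = x[m * hop : m * hop + win_len] * window
        # Time-fold: add the win_len / n_fft sub-blocks on top of each other.
        folded = seg.reshape(-1, n_fft).sum(axis=0)
        # FFT of the folded block; keep the non-negative frequencies only.
        out[m] = np.fft.rfft(folded)
    return out

# Hypothetical parameters: oversampling factor n_fft / hop = 4.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)
subbands = wola_analysis(signal, np.hanning(128), n_fft=32, hop=8)
band_energy = np.abs(subbands) ** 2  # per-band energies, one row per frame
```

Because the hop is smaller than the FFT size, the subband signals are oversampled. Per-band energies such as `band_energy` are one plausible starting point for recognition features, although the exact feature computation used in the paper is not given in the abstract.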

Similar resources

A low-resource, miniature implementation of the ETSI distributed speech recognition front-end

The purpose of this work is to demonstrate that distributed speech recognition front-ends can be deployed in environments which provide for very little power and CPU resources, with possibly no degradation of speech recognition quality when compared to standard floating-point implementations. The ETSI distributed speech recognition front-end standard is implemented on an ultra low-power miniatur...

A Low-resource, Miniature of the Etsi Distributed Speech R

The purpose of this work is to demonstrate that distributed speech recognition front-ends can be deployed in environments which provide for very little power and CPU resources, with possibly no degradation of speech recognition quality when compared to standard floating-point implementations. The ETSI distributed speech recognition front-end standard is implemented on an ultra low-power miniature...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

A Noise-Robust ASR Back-end Technique Based on Weighted Viterbi Recognition

The performance of speech recognition systems trained in quiet degrades significantly under noisy conditions. To address this problem, a Weighted Viterbi Recognition (WVR) algorithm that is a function of the SNR of each speech frame is proposed. Acoustic models trained on clean data and the acoustic front-end features are kept unchanged in this approach. Instead, a confidence/robustness factor...

Interpolate to Enhance for NonStationary Signal Processing

Enhancing non-stationary signals is crucial for many applications, such as speech recognition, audio communication, and bio-signals analysis. The present paper investigates a novel processing structure (alternative to the overlap-add scheme), based on an interpolated zero-phase FIR filtering. The proposed structure accounts for slow signal non-stationarity, and also natively supports time and f...

Publication date: 2005